18 research outputs found
Driver Distraction Identification with an Ensemble of Convolutional Neural Networks
The World Health Organization (WHO) reported 1.25 million deaths yearly due
to road traffic accidents worldwide and the number has been continuously
increasing over the last few years. Nearly a fifth of these accidents are caused
by distracted drivers. Existing work on distracted driver detection is
concerned with a small set of distractions (mostly, cell phone usage).
Unreliable ad-hoc methods are often used. In this paper, we present the first
publicly available dataset for driver distraction identification with more
distraction postures than existing alternatives. In addition, we propose a
reliable deep learning-based solution that achieves 90% accuracy. The system
consists of a genetically-weighted ensemble of convolutional neural networks;
we show that weighting an ensemble of classifiers using a genetic algorithm
yields better classification confidence. We also study the effect of
different visual elements in distraction detection by means of face and hand
localizations, and skin segmentation. Finally, we present a thinned version of
our ensemble that could achieve 84.64% classification accuracy and operate in a
real-time environment.
Comment: arXiv admin note: substantial text overlap with arXiv:1706.0949
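The genetically-weighted ensemble described in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' implementation, so everything below is an assumption made for illustration: the function names (`ensemble_predict`, `accuracy`, `evolve_weights`), the population size, the uniform crossover, the Gaussian mutation, and the use of validation accuracy as the fitness. The sketch only shows the general idea of searching for per-classifier weights with a genetic algorithm.

```python
import random

def ensemble_predict(prob_sets, weights):
    """Weighted average of per-model class probabilities, then argmax per sample.

    prob_sets[m][i][c] is model m's probability of class c for sample i.
    """
    total = sum(weights)
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    preds = []
    for i in range(n_samples):
        scores = [
            sum(w * probs[i][c] for w, probs in zip(weights, prob_sets)) / total
            for c in range(n_classes)
        ]
        preds.append(max(range(n_classes), key=scores.__getitem__))
    return preds

def accuracy(prob_sets, weights, labels):
    """Fraction of samples the weighted ensemble classifies correctly."""
    preds = ensemble_predict(prob_sets, weights)
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def evolve_weights(prob_sets, labels, pop_size=20, generations=30, seed=0):
    """Hypothetical genetic search over ensemble weights (illustrative only)."""
    rng = random.Random(seed)
    n_models = len(prob_sets)
    # Each individual is one positive weight per model.
    population = [[rng.random() + 1e-6 for _ in range(n_models)]
                  for _ in range(pop_size)]

    def fitness(w):
        return accuracy(prob_sets, w, labels)

    for _ in range(generations):
        # Keep the fitter half unchanged (elitism), refill with offspring.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]    # uniform crossover
            k = rng.randrange(n_models)
            child[k] = max(1e-6, child[k] + rng.gauss(0, 0.1))  # mutate one weight
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```

In use, `prob_sets` would hold each network's softmax outputs on a held-out validation set, and the returned weights would then be applied to the test-time outputs through `ensemble_predict`.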
Dynamic Conditional Imitation Learning for Autonomous Driving
Conditional imitation learning (CIL) trains deep neural networks, in an
end-to-end manner, to mimic human driving. This approach has demonstrated
suitable vehicle control when following roads, avoiding obstacles, or taking
specific turns at intersections to reach a destination. Unfortunately,
performance dramatically decreases when deployed to unseen environments and is
inconsistent against varying weather conditions. Most importantly, the current
CIL fails to avoid static road blockages. In this work, we propose a solution
to those deficiencies. First, we fuse the laser scanner with the regular camera
streams, at the features level, to overcome the generalization and consistency
challenges. Second, we introduce a new efficient Occupancy Grid Mapping (OGM)
method along with new algorithms for road blockage avoidance and global route
planning. Consequently, our proposed method dynamically detects partial and
full road blockages, and guides the controlled vehicle to another route to
reach the destination. Following the original CIL work, we demonstrated the
effectiveness of our proposal on the CARLA simulator urban driving benchmark. Our
experiments showed that our model improved consistency against weather
conditions by four times and autonomous driving success rate generalization by
52%. Furthermore, our global route planner improved the driving success rate by
37%. Our proposed road blockage avoidance algorithm improved the driving
success rate by 27%. Finally, the average kilometers traveled before a
collision with a static object increased by 1.5 times. The main source code is
available at https://heshameraqi.github.io/dynamic_cil_autonomous_driving.
Comment: 14 pages, 11 figures, 7 tables
Lip-Listening: Mixing Senses to Understand Lips using Cross Modality Knowledge Distillation for Word-Based Models
In this work, we propose a technique to transfer speech recognition
capabilities from audio speech recognition systems to visual speech
recognizers, where our goal is to utilize audio data during lipreading model
training. Impressive progress in the domain of speech recognition has been
exhibited by audio and audio-visual systems. Nevertheless, there is still much
to be explored with regard to visual speech recognition systems due to the
visual ambiguity of some phonemes. To this end, the development of visual
speech recognition models is crucial given the instability of audio models. The
main contributions of this work are i) building on recent state-of-the-art
word-based lipreading models by integrating sequence-level and frame-level
Knowledge Distillation (KD) into their systems; ii) leveraging audio data when
training visual models, which has not been done in prior word-based
work; iii) proposing Gaussian-shaped averaging in frame-level KD, as an
efficient technique that aids the model in distilling knowledge at the sequence
model encoder. This work proposes a novel and competitive architecture for
lip-reading, as we demonstrate a noticeable improvement in performance, setting
a new benchmark of 88.64% on the LRW dataset.
Comment: arXiv admin note: text overlap with arXiv:2108.0354
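The abstract names Gaussian-shaped averaging for frame-level KD but does not define it. As a rough sketch of one possible reading, the code below averages per-frame feature vectors using Gaussian weights centered on the middle frame of a clip. The function names, the choice of the clip center as the peak, and the `sigma` parameter are all assumptions made here for illustration and are not taken from the paper.

```python
import math

def gaussian_weights(num_frames, sigma):
    """Normalized Gaussian weights over frame indices, peaked at the clip center.

    Centering on the middle frame is an assumption for this sketch.
    """
    center = (num_frames - 1) / 2
    raw = [math.exp(-((t - center) ** 2) / (2 * sigma ** 2))
           for t in range(num_frames)]
    total = sum(raw)
    return [r / total for r in raw]

def gaussian_average(frame_features, sigma=2.0):
    """Collapse a sequence of per-frame feature vectors into one vector.

    frame_features[t][d] is feature d of frame t; frames near the center
    contribute more than frames near the clip boundaries.
    """
    weights = gaussian_weights(len(frame_features), sigma)
    dim = len(frame_features[0])
    return [sum(w * f[d] for w, f in zip(weights, frame_features))
            for d in range(dim)]
```

Under this reading, the averaged teacher (audio) features could be compared against the student (visual) features with any distillation loss; which loss the authors use is not stated in the abstract.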
A New Efficient Graphemes Segmentation Technique for Offline Arabic Handwriting
[abstract not available]
On-line Arabic Handwritten Personal Names Recognition System Based on HMM
[abstract not available]
Autonomous driving in the face of unconventional odds
Autonomous driving (AD) will play a vital role in saving human lives and preventing substantial property damage, with approximately 90% of accidents occurring due to human error. AD promises greater mobility, energy savings, and less air pollution. Despite recent advances toward this promising vision, enabling autonomous vehicles in complex environments remains a challenge. Improving road infrastructure should start with deploying Road Asset Management Systems (RAMS) to structurally plan and implement maintenance. In low-income countries, the RAMS data collection process should be more frequent due to the low-cost material used, and so it is more costly